Initial Cluster Analysis

Authors

  • Stephen F. Altschul
  • Andrew F. Neuwald
Abstract

We study a simple abstract problem motivated by a variety of applications in protein sequence analysis. Consider a string of 0s and 1s of length L, containing D 1s. If we believe that some or all of the 1s may be clustered near the start of the sequence, which subset is the most significantly so clustered, and how significant is this clustering? We approach this question using the minimum description length principle and illustrate its application by analyzing residues that distinguish translational initiation and elongation factor guanosine triphosphatases (GTPases) from other P-loop GTPases. Within a structure of yeast elongation factor 1α, these residues form a significant cluster centered on a region implicated in guanine nucleotide exchange. Various biomedical questions may be cast as the abstract problem considered here.
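The abstract problem can be made concrete with a small sketch. The code below is not the authors' method; it is a simplified two-part coding scheme, assumed here for illustration only. It compares the cost of describing the string with no structure (choose D positions out of L) against the cost of naming a cut point, stating how many 1s fall before it, and then describing each side separately. The function name `best_initial_cluster` and the parameter cost `log2(L) + log2(D + 1)` are assumptions of this sketch.

```python
from math import comb, log2


def best_initial_cluster(bits):
    """Return (cut, saving_in_bits) for the best prefix under a toy MDL score.

    Illustrative assumption, not the published algorithm: the null model
    costs log2(C(L, D)) bits; the alternative pays log2(L) + log2(D + 1)
    bits to state the cut and the count of 1s before it, then encodes
    each segment separately.
    """
    L = len(bits)
    D = sum(bits)
    if L < 2:
        return 0, 0.0  # no non-trivial cut exists

    null_bits = log2(comb(L, D))
    param_bits = log2(L) + log2(D + 1)

    best_cut, best_saving = 0, 0.0
    ones_in_prefix = 0
    for cut in range(1, L):
        ones_in_prefix += bits[cut - 1]
        alt_bits = (
            param_bits
            + log2(comb(cut, ones_in_prefix))
            + log2(comb(L - cut, D - ones_in_prefix))
        )
        saving = null_bits - alt_bits
        if saving > best_saving:
            best_cut, best_saving = cut, saving
    return best_cut, best_saving


clustered = [1] * 8 + [0] * 56   # all eight 1s at the start
spread = [1, 0] * 16             # 1s spread evenly
cut, saving = best_initial_cluster(clustered)
```

A cut of 0 with a saving of 0.0 means that no prefix pays for its own description cost, so this toy score reports no initial cluster. A positive saving of s bits can be read informally as evidence on the order of 2^-s against the unstructured description, again only under the assumptions of this sketch.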

Similar resources

Improved Fuzzy C-means Clustering Algorithm Based on Sample Density

Fuzzy clustering techniques, especially the fuzzy c-means (FCM) clustering algorithm, have been widely used in automated image segmentation. The performance of the FCM algorithm depends on the selection of the initial cluster centers and/or the initial membership values. If a good initial cluster center that is close to the actual final cluster center can be found, the FCM algorithm will converge very q...

Modified K-Means for Better Initial Cluster Centres

The k-means clustering algorithm is widely used in data mining for real-world applications. The efficiency and performance of the k-means algorithm are greatly affected by the initial cluster centers, as different initial cluster centers often lead to different clusterings. In this paper, we propose a modified k-means algorithm which has additional steps for selecting better cluster centers. W...

An Improved PSO Clustering Algorithm Based on Affinity Propagation

Particle swarm optimization (PSO) is undoubtedly one of the most widely used swarm intelligence algorithms. Generally, each particle is assigned an initial value randomly. In this paper, an improved PSO clustering algorithm based on affinity propagation (APPSO) is proposed, which provides new ideas and methods for cluster analysis. Firstly, the proposed algorithm gets initial cluster centers by aff...

A Clustering Approach by SSPCO Optimization Algorithm Based on Chaotic Initial Population

The main task of clustering analysis is to assign a set of objects to groups such that objects in one group, or cluster, are more similar to each other than to objects in other clusters. The SSPCO optimization algorithm is a new optimization algorithm inspired by the behavior of a type of bird called the see-see partridge. One of the things that smart algorithms are applied to solve is the problem ...

Cluster center initialization algorithm for K-means clustering

The performance of iterative clustering algorithms, which converge to numerous local minima, depends highly on the initial cluster centers. Generally, initial cluster centers are selected randomly. In this paper we propose an algorithm to compute initial cluster centers for K-means clustering. This algorithm is based on two observations that some of the patterns are very similar to each other and that is ...

Computing Initial points using Density Based Multiscale Data Condensation for Clustering Categorical data

The K-Modes clustering algorithm [1] has shown great promise for clustering large data sets with categorical attributes. The K-Modes clustering algorithm suffers from the drawback of random selection of the initial points (modes) of the clusters. Different initial points lead to different cluster formations. In this paper, the Density-based Multiscale Data Condensation [2] approach with Hamming dist...

Journal title:

Volume 25, Issue

Pages -

Publication date: 2018